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ABSTRACT 


The  use  of  3-D  geophysical  models  to  predict  nuclear  test  ban  monitoring  observables  (phase  travel  times, 
amplitudes,  dispersion,  etc.)  is  widely  anticipated  to  provide  improvements  in  the  basic  seismic  monitoring  functions 
of  detection,  association,  location,  discrimination  and  yield  estimation.  A  number  of  questions  arise  when 
contemplating  a  transition  from  1-D,  2-D  and  2.5-D  models  to  constructing  and  using  3-D  models,  among  them: 

1)  Can  a  3-D  geophysical  model  or  a  collection  of  3-D  models  provide  measurably  improved  predictions 
of  seismic  monitoring  observables  over  existing  1-D  models,  or  2-D  and  2y2-D  models  currently  under 
development? 

2)  Is  a  single  model  that  can  predict  all  observables  achievable,  or  must  separate  models  be  devised  for 
each  observable?  How  should  joint  inversion  of  disparate  observable  data  be  performed,  if  required? 

3)  What  are  the  options  for  model  representation?  Are  multi-resolution  models  essential?  How  does 
representation  affect  the  accuracy  and  speed  of  observable  predictions? 

4)  How  should  model  uncertainty  be  estimated,  represented,  and  how  should  it  be  used?  Are  stochastic 
models  desirable? 

5)  What  data  types  should  be  used  to  construct  the  models?  What  quality  control  regime  should  be 
established? 

6)  How  will  3-D  models  be  used  in  operations?  Will  significant  improvements  in  the  basic  monitoring 
functions  result  from  the  use  of  3-D  models?  Will  the  calculation  of  observables  through  3-D  models  be 
fast  enough  for  real-time  use  or  must  a  strategy  of  pre-computation  be  employed? 

7)  What  are  the  theoretical  limits  to  3-D  model  development  (resolution,  uncertainty)  and  performance  in 
predicting  monitoring  observables?  How  closely  can  those  limits  be  approached  with  projected  data 
availability,  station  distribution  and  inverse  methods? 

8)  What  priorities  should  be  placed  on  the  acquisition  of  event  ground  truth  information,  deployment  of 
new  stations,  development  of  new  inverse  techniques,  exploitation  of  large-scale  computing,  and  other 
activities  in  the  pursuit  of  3-D  model  development  and  use? 


In  this  paper,  we  examine  what  technical  issues  must  be  addressed  to  answer  these  questions.  Although  convened 
for  a  somewhat  broader  purpose,  the  June  2007  Workshop  on  Multi-resolution  3D  Earth  Models  held  in  Berkeley, 
CA  also  touched  on  this  topic.  Results  from  the  workshop  are  summarized  briefly  at  the  end  of  this  paper. 
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OBJECTIVES 

The  objective  of  this  paper  is  to  lay  out  technical  considerations  in  the  development  and  use  of  three-dimensional 
(3D)  geophysical  models  for  predicting  the  seismic  observables  used  to  monitor  compliance  with  nuclear  test  ban 
treaties.  These  considerations  could  inform  the  structure  of  a  program  designed  to  develop  a  3D  model  or  models 
intended  for  operational  use. 

RESEARCH  ACCOMPLISHED 

Monitoring  Observables 

For  purposes  of  this  discussion,  the  basic  seismic  monitoring  functions  consist  of  phase  detection,  phase  association, 
event  location,  discrimination,  and  characterization  (i.e.  magnitude  and  yield  estimation).  The  potential  seismic 
observables  useful  for  these  basic  functions  range  from  parametric  data  such  as  phase  amplitudes,  arrival  times  and 
surface  wave  dispersion  characteristics  (e.g.,  group  delay)  to  derived  waveform  characteristics  such  as  the  coda 
envelope,  to,  in  some  cases,  the  full  seismic  waveform.  Treaty  monitoring  is  carried  out  at  local  (<  -200  km), 
regional  (  <  -2000  km)  and  teleseismic  (>  -2000  km)  distances.  The  principal  phases  are  P,  S,  and  surface  waves 
(Rg)  at  local  distances,  and  Pn,  Pg,  Sn,  Lg,  and  the  surface  waves  (Rayleigh,  Love)  at  regional  distances  (Figure  1). 
A  very  large  number  of  phases  are  available  for  monitoring  purposes  at  teleseismic  ranges,  though  the  principal 
phases  that  are  intensively  calibrated  are  the  first  arriving  P  and  S,  and  surface  waves. 


body  phases 

Origin  Time  Pn  Pg  Sn  Lg 


Time  (seconds) 

Figure  1.  Principal  parametric  observables  at  regional  distances  consist  of  arrival  times  and  amplitudes  of  the 
4  major  phases  (top  trace)  Pn,  Pg,  Sn,  Lg,  and  dispersion  characteristics  of  long  and  short  period 
surface  waves  (bottom  traces). 

One  of  the  most  important  factors  affecting  the  development  and  use  of  3D  models  (principally  through  the  required 
resolution  of  the  models)  is  the  frequency  band  in  which  monitoring  observables  are  measured.  The  nuclear  test 
monitoring  band  extends  from  20  seconds  (for  Ms,  magnitude  measurements  from  surface  wave  amplitudes)  to 
10  Hz  or  even  higher  (for  body  wave  amplitude  measurements  used  in  discrimination  and  arrival  time  picks  used  in 
location).  Generally  data  are  not  available  to  constrain  3D  models  sufficiently  to  make  accurate  predictions  of 
waveforms  at  the  high-frequency  end  of  the  spectrum.  However,  it  may  be  possible  to  develop  stochastic  models 
that  predict  bulk  characteristics  of  seismic  waves,  such  as  amplitude  and  coda  shape. 
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Modeling  Objectives:  Accuracy,  Operational  Considerations 

Seismic  events  are  detected  and  interpreted  by  a  process  of  matching  observed  signal  characteristics  with 
characteristics  predicted  by  a  model  from  trial  values  of  the  principal  event  parameters:  location,  origin  time, 
magnitude,  mechanism  and  source  time  history.  Often  the  process  is  iterative,  as  indicated  in  Figure  2.  The  process 
drives  the  residuals  (differences)  between  predicted  and  measured  values  as  close  to  zero  as  possible.  Usually  station 
corrections  for  each  predicted  observable  are  developed  from  observations  of  ground  truth  events.  Station 
corrections  can  be  appreciable  where  models  (whose  geophysical  parameters  vary  relatively  smoothly  compared  to 
the  Earth)  fail  to  predict  observables  closely  enough.  Station  corrections  typically  take  the  form  of  maps  of  residual 
bias  (variation  from  zero)  as  a  function  of  latitude,  longitude,  and  sometimes  depth  of  the  sources  observed  by  the 
station. 


origin  time  Pn  Pg  Sn  Ug 


Figure  2.  Models  form  the  core  of  an  iterative  process  of  predicting  observed  signal  characteristics 

(parameters,  envelopes,  waveforms)  in  order  to  detect,  locate,  identify  and  characterize  seismic 
events. 

The  physical  parameters  comprising  the  models  include  compressional  (P)  and  shear  (S)  velocities,  intrinsic 
attenuation  for  both  P  and  S,  and  medium  density.  In  a  3D  model,  these  parameters  would  be  mapped  as  a  function 
of  latitude,  longitude,  and  depth  from  the  surface.  Additional,  higher-order  physical  parameters  such  as  anisotropy 
may  need  to  be  considered  as  well. 

It  is  desirable  to  concentrate  as  much  as  possible  of  the  predictive  capability  of  this  system  in  the  model,  since  the 
model  will  provide  predictions  for  stations  that  do  not  have  empirical  corrections  either  because  the  stations  are 
newly  deployed  or  because  they  observe  relatively  aseismic  regions.  Clearly,  one  of  the  principal  objectives  in 
building  a  model  is  to  make  predictions  derived  from  it  as  accurate  as  possible. 

However,  accuracy  is  not  the  only  objective.  The  models  of  interest  to  the  monitoring  community  are  intended  to  be 
embedded  in  pipelines  detecting  and  interpreting  events  from  real-time  data  streams.  Consequently,  calculations  of 
observables  from  the  models  must  be  fast  enough  to  keep  pace  with  a  real-time  system.  Often,  as  is  the  case  with 
location,  predictions  of  observables  must  be  made  repeatedly  when  interpreting  a  single  event  as  the  estimation 
algorithm  iterates  to  a  solution  from  a  starting  estimate.  Hundreds  or  thousands  of  evaluations  of  observed 
parameters  may  be  required  to  interpret  a  single  event.  This  requirement  implies,  for  example,  that  travel-time 
estimates  be  calculated  in  just  milliseconds. 
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Geographical  coverage  is  another  issue;  global  capability  is  required,  but  especially  good  predictions  are  desirable  in 
Eurasia  and  Africa. 

In  addition,  the  predictions  made  by  models  must  be  seamless.  Heretofore,  highly  accurate  local  and  regional 
observable  (e.g.,  travel-time)  models  have  been  spliced  together  to  form  a  single  model  spanning  a  large  (global  or 
continental)  region.  This  practice  sometimes  has  led  to  boundaries  between  regions  with  offsets  in  the  predicted 
observables.  These  offsets  have  been  somewhat  artificially  or  arbitrarily  smoothed  together  to  avoid  presenting 
discontinuities  to  estimation  algorithms.  A  desire  to  avoid  such  discontinuities  is  driving  the  monitoring  community 
toward  development  of  increasingly  geographically  comprehensive  models,  ideally  embedding  high-resolution 
regional  geophysical  models  seamlessly  in  global  models.  There  is  a  spreading  consensus  that  any  blending  or 
splicing  that  needs  to  be  done,  should  occur  in  the  geophysical  models  and  not  in  the  derived  predictions  of 
observables.  Because  of  the  heterogeneity  of  regions,  comprehensive  geophysical  models  are  necessarily  going  to  be 
at  least  2D  and  more  generally  3D. 

There  is  a  question  as  to  whether  3D  models  will  provide  sufficiently  improved  accuracy  to  justify  the  added  cost  of 
developing  them  and  the  pipeline  support  to  use  them.  It  is  clear  that  a  patchwork  quilt  of  ID  models  is  not 
sufficiently  accurate,  particularly  around  boundaries  among  regions  described  by  distinct  ID  models  where 
discontinuities  are  an  issue,  and  by  defaulting  to  iasp91  in  regions  where  no  data  are  available.  They  also  are 
complex  to  implement  and  maintain.  Currently  the  National  Nuclear  Security  Administration  (NNSA)  labs  are 
engaged  in  building  2.5D  approximations  to  3D  models  for  predicting  regional  phase  travel  times.  These  are  planned 
to  be  tomographic  models,  which  should  be  seamless  and  provide  greater  predictive  accuracy  than  ID  models 
everywhere,  including  in  aseismic  regions. 

One  Model  or  Many? 

Previous  seismological  practice  in  model  building  was  to  construct  separate  models  for  each  observable  to  be 
predicted,  and  generally  to  use  data  of  the  intended  observable  to  invert  for  the  model.  Hence,  body  wave  travel-time 
observations  were  used  to  construct  predominantly  P  velocity  models  and  surface  wave  dispersion  data  to  construct 
shear  wave  models.  These  models  often  had  their  best  resolutions  at  different  depths  due  to  the  fact  that  body  phases 
sample  the  earth  (travel)  deeper  than  surface  waves.  Measurements  made  from  disparate  wave  types  represent 
different  averages  of  often  distinct  geophysical  parameters. 

Current  practice,  especially  in  global  tomography,  is  to  perform  joint  inversions  of  several  or  even  many  data  types 
to  estimate  one  or  often  several  geophysical  parameters.  The  use  of  multiple  data  types  improves  coverage  by 
providing  constraints  at  different  depths  and  observations  over  a  larger  number  of  (great  circle)  paths. 

While  the  desirability  of  inverting  many  data  types  simultaneously  to  produce  models  is  generally  accepted,  a 
question  remains  about  whether  to  strive  for  a  single  model  to  predict  all  observables  or  individual  models  to  predict 
each  observable  separately.  A  single  model  is  desirable  for  operational  and  maintenance  simplicity,  but  it  may  be 
that  dedicated  models  will  have  greater  prediction  accuracy,  especially  if  they  employ  specialized  representations  for 
the  observable  in  question  (such  as  dispersion  maps  for  surface  wave  group  delay  predictions  instead  of  P  and  S 
velocity  models). 

Model  Representation  Uncertainty 

There  are  a  number  of  separate  issues  to  be  resolved  that  fall  under  the  general  heading  of  model  representation. 
While  not  necessarily  unique  to  3D  models,  these  issues  are  exacerbated  by  “the  curse  of  dimensionality,”  the  rapid 
increase  in  complexity,  and  storage  and  computational  requirements  in  the  transition  to  3D  representations. 

The  first  issue  is  the  mathematical  form  of  representation  for  any  particular  geophysical  quantity.  Options  include 
sampling  the  quantity  on  a  regular  geographic  grid  of  points  or  cells,  sampling  on  a  tessellated  mesh,  and 
representation  in  terms  of  the  coefficients  in  any  basis  function  expansion.  Often  models  are  described  most  simply 
as  a  stack  of  layers  in  depth  in  latitude-longitude  cells;  the  parameters  are  layer  thicknesses,  P  and  S  velocities, 
intrinsic  Q  for  P  and  S  waves,  and  medium  density. 

One  consideration  for  the  choice  of  representation  is  the  fidelity  of  the  model  to  the  Earth  it  represents.  Model 
fidelity  is  influenced  by  the  resolution  of  the  representation  and  the  accuracy  of  suitable  interpolation  schemes. 
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Generally,  the  greater  the  resolution,  the  more  accurate  the  representation  (especially  of  boundaries)  and  the  larger 
the  space  required  to  store  the  model.  The  accuracy  of  the  model,  however,  is  most  fundamentally  limited  by  the 
coverage  of  the  data  used  to  construct  it. 

A  solution,  almost  sure  to  be  employed  in  development  of  large  3D  models,  is  the  use  of  multiple  resolutions.  The 
model  may  be  very  dense  in  the  shallow  regions  of  the  crust  and  lithosphere,  where  heterogeneity  is  greatest,  and 
more  accurate  predictions  of  rapidly- varying  observables  is  desired.  The  model  may  be  sparse  in  the  deeper,  more 
homogeneous  mantle  and  core.  In  the  nuclear  test  ban  application,  some  geographic  regions  will  be  more  accurately 
and  finely  represented  than  others  of  lesser  importance. 

A  second  consideration  is  the  accuracy  of  the  waveform  or  parameter  observable  generated  by  forward  calculation 
through  the  model.  Computed  observable  accuracies  may  require  resolutions  that  exceed  those  required  for  model 
fidelity.  Numerical  requirements  may  dictate  sampling  densities  as  high  as  ten  times  per  wavelength,  often  much 
more  finely  sampled  than  justified  by  available  constraints  on  the  model  implied  by  available  data.  The  choice  of 
model  representation  often  significantly  influences  numerical  accuracy,  as  can  happen  with  the  superior  definition  of 
reflecting  boundaries  finite  or  spectral  element  methods. 

A  third  and  related  consideration  is  the  speed  with  which  observables  can  be  calculated  through  the  model.  Many 
algorithms  required  extraction  of  Earth  model  quantities  along  great  circle  paths  or  along  more  arbitrary  paths  (as 
with  full  ray  tracing).  Such  algorithms  place  a  premium  on  fast  interpolation  methods,  since  they  require 
geophysical  parameter  values  at  points  generally  not  coincident  with  mesh  or  grid  sample  locations.  Tesselation 
meshes  have  to  be  “walked”  to  find  the  specific  tetrahedra  where  model  values  are  required.  Efficiency  of  model 
interrogation  and  interpolation,  accuracy,  and  model  storage  requirements  may  trade  against  each  other. 

The  second  major  issue  is  the  estimation  and  representation  of  model  uncertainty.  In  current  practice,  the  majority  of 
models,  even  the  vast  majority,  are  deterministic.  Where  uncertainties  in  model  parameters  are  estimated  and 
reported  at  all,  usually  Gaussian  statistics  are  assumed  and  model  parameters  are  described  as  the  mean  of  the 
distribution.  The  variance  may  be  estimated  from  a  priori  considerations  of  normal  geophysical  variation,  or  more 
robustly  through  a  statistical  procedure  such  as  the  jackknife  or  bootstrap  during  inversion.  Although  variations 
among  different  model  parameters  are  almost  certainly  correlated,  covariance  estimates  are  not  commonly  reported. 

Stochastic  geophysical  models  [  e.g.  Pasyanos  et  al.,  2006]  are  an  attractive  though  computationally-  and  storage¬ 
intensive  alternative  (particularly  in  3D)  to  deterministic  models.  Stochastic  models  represent  uncertainty  with  a 
collection  of  thousands  of  realizations  that  provide  a  posterior  distribution  on  sample  models  that  are  consistent  with 
available  data  (see  e.g..  Figure  3).  The  density  of  sample  models  in  the  model  parameter  space  constitutes  an 
approximate  probability  distribution  on  the  model  that  can  be  used  to  extract  moments  (mean  model,  variance)  that 
characterize  uncertainty.  This  representation  easily  embraces  non-Gaussian,  and  even  multi-modal,  statistics. 
Relatively  efficient  grid-search  methods  exist  to  produce  the  collection  of  realizations  that  constitute  a  3-D 
stochastic  model  by  fitting  simultaneously  a  variety  of  observation  data.  However,  large-scale  computing  is  required 
to  check  the  thousands  of  representative  sample  3-D  models  against  available  data  by  predicting  a  variety  of 
observables  (surface  wave  dispersion,  body-wave  travel  times,  receiver  functions,  etc.).  Stochastic  models  of  this 
sort  readily  allow  the  estimation  of  uncertainty  in  predicted  observables,  since  multiple  estimates  of  the  observables 
can  be  obtained  by  forward  calculation  through  the  suite  of  sample  realizations.  The  result  is  a  distribution  on  the 
observables,  as  shown  in  Figure  3. 

A  Priori  Model  Development 

Geophysical  models  can  be  developed  in  several  ways,  including  the  compilation  of  existing  information  from 
diverse  sources  such  as  maps,  general  physical  constraints  on  earth  structure  and  specific  geologic  descriptions  from 
prior  studies  into  a  priori  geophysical  models.  Figure  4  below,  shows  various  aspects  of  a  unified  3D  geophysical 
model  of  Eurasia  and  North  Africa  developed  by  the  NNSA  labs.  Eawrence  Eivermore  National  Eaboratory  (EENE) 
constructed  the  western  half  of  the  model,  which  is  designated  the  WENA  (Western  Eurasia,  North  Africa)  model. 
The  WENA  model  is  compiled  from  a  large  number  of  literature  sources  [Pasyanos  et  al,  2004].  A  priori  models  are 
useful  in  themselves,  but  also  may  serve  as  the  starting  point  for  the  other  major  class  of  models,  tomographic 
models,  often  as  Bayesian  priors.  Tomographic  models  are  obtained  by  inverting  mathematically  observations  of 
phase  travel  times,  amplitudes,  dispersion  and  other  characteristics  over  many  crossing  paths  to  constrain  the  3D 
velocity  structure  and  other  model  parameters. 
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Model  Validation 


Model  validation  is  an  essential  component  of  all  model  development  and  acceptance  as  it  tests  the  predictive 
capabilities  of  a  model.  The  fundamental  steps  of  validation  are  the  same  as  in  model  estimation  from  data: 

(1)  collection  of  ground  truth  information  on  events  for  which  seismic  observables  can  be  measured,  (2) 
development  of  procedures  to  predict  those  observables  through  the  model,  (3)  collection  of  waveform  or  other  data 
from  which  to  measure  observables,  (4)  making  the  measurements  and  (5)  comparing  the  measurements  with  the 
model  predictions.  The  additional  step  in  model  construction  not  performed  in  validation  is  adjusting  the  model 
parameters  in  an  iterative  process  to  better  match  the  measurements. 


Figure  3.  Six  realizations  from  a  3D  stochastic  geophysical  model  surrounding  the  Korean  peninsula 

demonstrate  the  concept  that  the  uncertainty  for  the  model  is  captured  by  a  probability  distribution 
represented  by  the  density  of  realizations  in  model  space.  At  upper  right,  distributions  for  predicted 
travel-time  observables  along  a  specific  path  show  the  ability  of  this  type  of  model  to  predict 
uncertainties  in  monitoring  observables. 

An  example  of  travel-time  model  validation  is  shown  in  Figure  5  for  the  WEN  A  model  [Flanagan  et  al.,  2007].  In 
this  case,  the  model  is  used  to  predict  first  P  arrival  times.  Since  this  is  a  3-D  model,  the  travel  times  are  predicted 
with  a  3-D  finite-difference  code  solving  the  eikonal  equations.  This  assessment  shows  that  even  an  a  priori  3-D 
geophysical  model  improves  travel  time  predictions  compared  to  a  standard  (iasp91)  1-D  Earth  model. 

Validation  of  tomographic  (data-driven)  models  presents  a  dilemma:  how  much  of  the  available  ground  truth 
information  and  measurements  should  be  used  to  develop  the  model  and  how  much  should  be  held  out  to  perform 
validation?  To  prevent  circularity  problems,  the  same  data  cannot  be  used  for  both  functions.  Cross  validation  and 
similar  statistical  sampling  techniques  can  obviate  this  problem  at  added  computational  cost.  The  problem  becomes 
particularly  acute  in  the  development  of  3D  models,  because  the  added  dimension  of  such  models  means  that  many 
more  model  parameters  must  be  estimated,  and  more  and  more  data  are  required  to  constrain  the  estimates. 
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Data  Types  Required  for  the  Inverse  Problem 

In  this  paper,  we  use  the  term  “inverse  problem”  to  mean  the  problem  of  estimating  geophysical  model  parameters 
from  seismic  and  other  observables.  By  contrast,  the  forward  problem  is  that  of  predicting  the  observables  from  an 
extant  model.  The  data  types  required  or  available  to  solve  the  inverse  problem  are  many  and  varied.  They  include 
measurements  of  the  observable  types  that  the  models  are  intended  to  predict:  P  and  S  absolute  and  differential 
travel  times,  P  and  S  wave  amplitudes,  and  surface  wave  dispersion.  They  also  include  other  types  of  data  that 
provide  additional,  often  complementary  constraints:  receiver  functions,  structural  constraints  from  refraction  and 
reflection  seismology,  gravity  anomaly  data  from  satellite  orbital  perturbations,  heat  flow,  normal  mode  peak 
splitting,  and  constraints  from  geologic  or  tectonic  interpretations,  to  name  some  of  the  more  important  types. 


Figure  4.  Unified  a  priori  geophysical  model  of  Eurasia 

and  North  Africa.  Regionalization  in  blue  in  map 
at  upper  left  contributed  by  LLNL  and  in  red, 
contributed  by  LANE.  The  upper  right  map 
shows  an  a  priori  estimate  of  the  thickness  of  the 
crust,  and  the  map  at  lower  right  depicts  an 
estimate  of  the  P  velocity  at  the  top  of  the  mantle 
(suitable  for  predicting  Pn  velocities). 


UppfrHuetc  P-wav«  Vfrloeic.y 


As  mentioned  earlier,  different  data  types  result  from  waves  or  fields  that  sample  different  portions  of  the  earth  and 
do  so  with  different  weighting  or  averaging  over  the  regions  to  be  model  (expressed  as  different  resolution  kernels  in 
inverse  methods).  The  issue  for  developing  large-scale  3D  models  is  again  the  curse  of  dimensionality — 3D  models 
have  many  more  parameters  to  estimate  than  2D  or  ID  models.  The  larger  and  more  varied  the  volume  of  data 
brought  to  bear  on  the  problem,  the  more  certain  will  be  the  result  and  the  better  the  resolution  (within  limits  set  by 
the  distribution  of  stations  and  sources). 

Path  Coverage  and  Data  Quality 

Data  coverage  and  quality  issues  are  among  the  most  important  practical  considerations  when  contemplating  the 
construction  of  3D  models.  For  those  data  types  that  are  seismic  observables,  the  distribution  of  paths  sampling  the 
volume  of  the  Earth  is  not  under  our  control.  Events  are  not  uniformly  distributed,  but  highly  concentrated  along 
plate  boundaries.  Stations  are  not  uniformly  distributed  either,  because  of  the  fact  that  land  mass  constitutes  only 
30%  of  the  earth’s  surface,  and  many  regions  are  inaccessible  for  political  or  logistical  reasons. 

The  number  of  events  available  to  provide  data  for  model  building  is  further  constrained  by  quality  requirements. 
Especially  for  constraining  the  velocities  of  models  from  seismic  wave  travel-time  observations,  the  locations. 
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depths,  and  origin  times  of  events  have  to  be  known  accurately.  This  has  led  to  requirements  on  ground  truth 
information,  generally  designated  as  GTx,  where  “x”  specifies  the  uncertainty  in  the  location  of  the  event,  e.g.,  1 
km,  5  km,  or  25  km.  A  significant  issue  for  the  development  of  3D  models  is  that  the  number  of  events  with  the 
most  desirable  ground  truth  levels  (GT5  or  better)  is  in  the  low  thousands,  whereas  the  total  number  of  events 
reported  in  major  bulletins  is  in  the  many  hundreds  of  thousands.  The  development  of  high  quality  ground  truth 
information  remains  a  priority  for  model  development. 
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Figure  5.  Observed  travel-time  residuals  at  station  OBN  (Obninsk,  Russia)  against  predictions  from  the  1-D 
iasp91  model.  At  left,  the  measured  (color  coded)  residuals  against  iasp91  predictions  are  plotted  as 
points,  superimposed  on  a  surface  representing  the  difference  between  the  3-D  WENA  model 
prediction  and  the  iasp91  prediction.  Note  that  the  a  priori  WENA  prediction  correlates  well  with 
the  measured  residuals.  Summary  statistics  at  right  show  that  the  WENA  residuals  are  closer  to 
zero  mean  and  have  smaller  variance  than  the  iasp91  residuals. 

The  Inverse  Problem:  Objective  Functions  and  Inversion  Methods 

From  what  we  have  discussed  so  far,  we  see  that  the  3D  models  of  interest  to  the  monitoring  community  ultimately 
are  likely  to  be  global,  multi-resolution  models  with  considerable  detail  at  shallow  depths  in  certain  continental 
regions  and  less  detail  under  the  oceans  and  in  the  deeper  mantle  and  core.  Development  of  such  models  will  make 
extensive  use  of  a  priori  constraints  to  offset,  to  the  degree  possible,  the  relatively  insufficient  (though  still  very 
large)  quantity  of  data  available  to  estimate  model  parameters.  To  minimize  pipeline  complexity  and  maximize 
software  maintainability,  there  will  be  pressure  to  consolidate  predictive  capability  into  as  few  separate  models  as 
possible,  ideally  into  a  single  model.  The  model  or  models  that  emerge  will  be  multifaceted,  representing 
simultaneously  many  or  all  geophysical  parameters  of  interest  and  predicting  many  or  all  monitoring  observables. 
Correspondingly,  many  disparate  data  types  will  have  to  be  inverted  simultaneously  to  estimate  the  model 
parameters. 

This  inversion  problem  represents  a  significant  challenge.  A  number  of  factors  will  have  to  be  addressed 
simultaneously  in  deciding  a  suitable  inversion  technique:  the  mathematical  objective  functions  used  as  figures  of 
model  merit  (including  the  related  statistical  descriptions  of  prediction  residuals),  whether  parametric  or  waveform 
data  will  be  fit,  the  resolution  of  the  model,  the  computational  and  storage  requirements. 
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The  statistics  of  the  residuals  (Figure  2)  between  model  predictions  and  data  will  largely  drive  the  objective 
functions.  If  Gaussian  statistics  are  adequate,  then  least-squares  objectives  may  be  indicated.  If  correlated  residuals 
are  likely,  then  an  objective  function  with  suitable  covariance  structure  will  be  required.  If  non-Gaussian  or 
multimodal  statistics  are  likely,  then  a  grid-search  approach  might  be  required,  such  as  a  Markov-Chain  Monte 
Carlo  (MCMC)  method.  A  multiple  objective  approach  (next  section)  might  be  superior  when  several  disparate 
observables  are  predicted  by  the  same  model.  A  priori  constraints  on  the  models  may  suggest  Bayesian  objective 
functions. 

The  choice  of  inversion  engine  also  may  be  driven  by  the  desirability  of  performing  constrained  optimization. 
Physical  constraints  (realistic  ranges  of  velocities  and  densities,  or  rheological  conditions,  for  example)  may 
required  penalty  function,  barrier  function,  interior  point,  or  other  constrained  optimization  techniques. 

If  stochastic  models  are  desired,  as  may  happen  if  a  realistic  exploration  is  required  of  model  tradeoffs  and 
parameter  correlations,  then  a  sophisticated  grid  search  (MCMC,  simulated  annealing)  will  be  necessary.  These 
options  almost  surely  will  require  large-scale  computing  to  construct  3D  models. 

Development  of  even  deterministic  3D  models  with  global  coverage  will  require  large  amounts  of  computer  storage 
and  speed.  A  typical  model  might  be  hundreds  of  megabytes  or  even  larger.  If  fitting  full  waveforms,  or  some 
functionals  of  waveforms  (e.g.,  envelopes)  in  the  higher  frequency  bands  of  monitoring  interests  becomes  important, 
then  model  prediction  of  observables  (waveforms)  will  require  computers  with  hundreds  or  thousands  of  processors. 
If  stochastic  sampling  approaches  are  employed,  then  storage  for  thousands  of  models  will  be  required.  The 
inversion  problem  then  would  be  in  the  terascale  or  petascale  range. 

June  2007  3D  Model  Workshop 

On  June  6  and  7,  2007,  LLNL  hosted  a  workshop  on  the  future  of  3D  models  in  seismic  monitoring  and  hazard 
applications.  The  objective  of  the  workshop  was  to  bring  together  government,  industry,  and  academic  leaders  in  the 
seismic  monitoring  and  hazard  communities  to  explore  issues  surrounding  the  transition  to  3D  models.  Several 
trends  favoring  3D  models  are  converging:  the  continued  proliferation  of  digital  seismic  stations  globally,  the 
increasing  availability  of  data  from  these  stations,  the  decreasing  cost  of  computer  storage  and  processing  power, 
and  the  increasing  sophistication  of  modeling  techniques. 

While  it  is  not  yet  certain  that  the  use  of  3D  models  can  provide  significantly  better  monitoring  performance  than 
that  obtained  with  2D  and  2.5D  models,  there  are  indications  that  this  is  the  case,  especially  where  full  waveform 
predictions  are  required.  For  example,  the  influence  of  basins  on  phase  amplification  and  attenuation  at  the  local 
and  regional  scales  is  significant  in  event  discrimination  and  estimation  of  mechanism  and  explosion  yields. 
Examples  of  significant  off-azimuth  and  multipath  wave  propagation  effects  have  been  noted  in  complex  regions, 
which  can,  for  example,  bias  location  and  surface  wave  magnitude  estimates.  The  partition  of  energy  between  Sn 
and  Lg  phases,  crucial  to  discrimination,  is  strongly  affected  by  propagation  along  paths  that  contain  some  leg  of 
oceanic  crust.  Accurate  numerical  predictions  of  these  effects  remain  elusive,  but  almost  surely  will  involve  the  use 
of  3D  structure.  All  of  these  effects  are  likely  to  become  more  important  as  smaller  events  are  observed  at  ever 
higher  frequencies  and  closer  range. 

Whether  good  3D  models  can  be  developed  is  an  issue,  especially  at  resolutions  required  to  support  accurate  high- 
frequency  (above  e.g.  1  Hz)  numerical  waveform  predictions.  The  prospects  for  developing  models  with  increased 
resolution  are  improving  with  the  availability  of  data  from  more  stations,  and  innovative  techniques  for  exploiting 
those  data,  particularly  ambient  noise  tomography.  Improvement  in  the  number  and  distribution  of  stations  and 
availability  of  data  is  still  the  single  most  important  factor  in  constraining  3D  models. 

However,  significant  improvements  in  the  large-scale  exploitation  of  available  data  with  joint  inversion  of  multiple 
data  types,  multiple-objective  optimization  to  implement  joint  inversion,  and  sophisticated  adjoint  methods  also  play 
a  major  role.  New  adjoint  methods  allow  much  more  accurate  computation  of  the  sensitivity  of  waveforms  to 
complex  3D  structure.  Spherical  finite-difference  solvers  for  the  eikonal  wave  equation  should  provide  more 
accurate  predictions  of  travel-time  from  3D  models.  These  techniques  and  others  are  being  demonstrated  on  the 
local  and  regional  scales  with  models  of  the  San  Francisco  Bay  Area,  southern  California  and  the  entire  state  with 
unprecedented  resolution. 
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One  key  to  widespread  adoption  of  these  techniques  is  the  availability  of  relatively  inexpensive  high-performance 
computing.  Development  of  strategic  computing  to  petascale  levels  is  being  driven  by  the  Department  of  Energy 
through  its  stockpile  stewardship  program.  Commercial  high-performance  computing  is  not  far  behind  with 
terascale  computing  now  becoming  available  at  universities  and  in  the  private  sector.  Large-scale  computing  also  is 
a  requirement  for  accurate  nonlinear  simulations  of  the  explosion  source  that  will  use  the  best  available  3D 
geophysical  models  to  predict  regional  seismic  observables  using  a  domain  decomposition  approach.  Realistic  first- 
principles  calculations  of  shock  and  material  damage  around  the  source  are  a  promising  means  for  predicting 
explosion  signatures  in  regions  where  no  historic  nuclear  test  data  are  available. 

The  prospect  of  building  3D  models  as  a  community  project  similar  to  the  SCEC  Community  Modeling 
Environment  [SCEC,  2007]  was  raised  and  a  specific  proposal  was  presented  to  construct  two  models  (one  P,  one  S) 
linked  through  a  common  crust  and  lithosphere  discontinuity  structure.  The  intent  of  the  P  model  would  be  to 
predict  teleseismic  and  regional  P  travel  times  in  order  to  improve  event  location  estimates.  The  S  model  would  be 
constructed  from  surface-wave  dispersion  measurements  with  periods  of  20  seconds  and  greater.  If  s  purpose  would 
be  to  predict  Rayleigh  and  possibly  Love  wave  phase  and  group  velocities  at  regional  and  teleseismic  ranges  for  the 
purpose  of  estimating  Ms  and  moment  tensors.  The  models  both  were  proposed  to  be  hierarchical  and  multi¬ 
resolution  to  provide  increased  resolution  where  justified  by  data  coverage.  A  common  format  and  shared  structure 
would  allow  the  models  to  be  merged  into  a  single  model  at  some  point. 

The  overriding  consideration  in  development  of  such  a  model  pair  was  to  pick  a  practical  goal  as  a  starting  point  for 
a  collaborative  effort  with  sufficient  flexibility  to  be  extended  to  more  complex  models,  if  warranted.  There  also 
was  a  strong  requirement  to  develop  models  immediately  useful  for  monitoring  objectives. 

CONCLUSIONS  AND  RECOMMENDATIONS 


The  proposal  for  a  community  3D  model  came  at  the  very  end  of  the  June  workshop.  We  recommend  conducting  a 
follow-on  workshop  dedicated  to  completing  specifications  for  a  3D  model  for  monitoring  applications  and  to 
examining  practical  means  for  building  such  a  model.  The  June  workshop  made  clear  that  there  is  significant 
possible  synergy  in  techniques  for  constructing  models  among  the  diverse  test  ban,  hazard  and  global  earth  structure 
communities,  even  though  the  objectives,  scales  and  geographic  boundaries  of  the  models  they  develop  individually 
might  differ  substantially.  The  notion  of  constructing  a  multiresolution  community  model  should  be  examined  to 
determine  whether  a  common  set  of  modeling  objectives  can  be  found  to  provide  incentives  for  all  to  collaborate  on 
data,  techniques  and  computing  resources. 
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